\(\newcommand{\mathds}[1]{\mathrm{I\hspace{-0.7mm}#1}}\) \(\newcommand{\bm}[1]{\boldsymbol{#1}}\) \(\newcommand{\bms}[1]{\boldsymbol{\scriptsize #1}}\) \(\newcommand{\proper}[1]{\text{#1}}\) \(\newcommand{\pE}{\proper{E}}\) \(\newcommand{\pV}{\proper{Var}}\) \(\newcommand{\pCov}{\proper{Cov}}\) \(\newcommand{\pACF}{\proper{ACF}}\) \(\newcommand{\I}{\bm{\mathcal{I}}}\) \(\newcommand{\wh}[1]{\widehat{#1}}\) \(\newcommand{\wt}[1]{\widetilde{#1}}\) \(\newcommand{\pP}{\proper{P}}\) \(\newcommand{\pAIC}{\textsf{AIC}}\) \(\DeclareMathOperator{\diag}{diag}\)

Lecture notes for MA52112 (Statistics for Data Science)

Author

Karim Anaya-Izquierdo (based on notes by Vangelis Evangelou)

Published

December 5, 2025

Overview of Statistics for Data Science

Synopsis

In this unit you will develop your understanding of the basic theory of probability and statistics and recognise when this theory can be applied in practice.

Learning outcomes

By the end of the unit you will be able to:

  • perform elementary mathematical operations in probability and statistics

  • translate real-world problems into a probabilistic or statistical framework

  • solve statistical problems in abstract form

  • critically interpret the outcomes of statistical analysis in a real-world context

  • relate underlying theory to requirements in practical data science

Content

The laws of probability. Discrete and continuous random variables. Expectation, variance and correlation. Conditional and marginal distributions. Common distributions including the normal, binomial and Poisson. Statistical estimation including maximum likelihood. Hypothesis testing and confidence intervals.

Summative assessment

  • Exam: 100% of unit mark.

Moodle page

Please see the Moodle page for this unit for a more detailed overview of the organisation and expectations for Statistics for Data Science this year.